Variables used in operators need to be configured in advance during task development.
Data Input Node
Data Input
- supOS instance
- Select the template, instance, and attribute from the supOS you added.
- Enable History to add attribute history information, such as alias, value, and time, as output fields.
- File data source
- HDFS is not supported.
- Select File match under File selection method, and you can use the wildcard * with a suffix to match file names, for example, test*.xlsx.
- Select File upload under File selection method, and you need to save the file in UTF-8 encoding if the file contains Chinese characters.
When the input data source is a file, the published task reads the file and outputs results again if the file changes.
- API data source
Item | Description |
---|---|
Data source type | |
Request address | Enter the API request address and use #{} to identify variables (see the example after this table). |
Request header param | Set the header parameter of the API. When setting Variable as the Processing method, you can select the value from global variables. |
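For example (the address and variable name below are hypothetical), a request address can reference a global variable named deviceCode, which is substituted at runtime:
http://example.com/api/history?device=#{deviceCode}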
Message Queue Input
Collects data from a message queue, maps it to a two-dimensional table, and delivers it to downstream operators for processing.
Item | Description |
---|---|
Data source type | Only Kafka and RocketMQ are available. |
Topic | Select the corresponding topic of the data to be added. You can click Data preview to view the data. |
Field settings | Click New field to add fields corresponding to output information. |
Initial offset | |
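For illustration only (the topic, message content, and field names below are hypothetical), a JSON message read from the selected topic such as
{"device": "pump01", "temperature": 36.5, "ts": "2023-01-01 12:00:00"}
can be mapped through Field settings to the output fields device, temperature, and ts, so that each message becomes one row of the two-dimensional output table.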
Data Warehouse Input
Supports tables and views on the ODS, DW, and local DW levels.
Install X-DAM in advance. Otherwise, Local data warehouse will not be available.
Item | Description |
---|---|
Data source type | Presently, only Local data warehouse is available with X-DAM installed. |
Data warehouse level | Select the level of the data in data warehouse. |
Data sheet | Select the data sheet. |
Data Processing Node
Data Quality
QC Rule | Description |
---|---|
Null check | Select a field, check whether the field has null values, and select from Discard, Fix and Ignore as the subsequent operation. |
Range check | Select a field, set a value range to filter the field, and keep only values within the set range. |
Data format check | Select a field, and use the embedded or custom expression to verify its data format (see the example after this table). |
Enumeration value check | Select a field, and set certain values to check whether the field value is one of the enumerated values. |
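For example, assuming regular expressions are accepted as custom expressions (the pattern below is hypothetical), a date-formatted field could be verified with:
^\d{4}-\d{2}-\d{2}$
which passes values such as 2023-01-05 and rejects values such as 2023/01/05.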
SQL Execution
Uses SQL statements to perform simple operations such as insert, delete, and update on relational data sources.
Data Set
Renames data fields or configures field mapping to generate a new dataset, mainly used for redefining the data structure during data processing.
Data Filtering
Multiple filter conditions can be set for a single field, and only data that meets all conditions is kept, as shown in the sketch after this list.
- in/not in: For numeric fields. The value is an array of numbers separated by commas. in means the field value equals one of the numbers, and not in means it equals none of them.
- like/not like: For text fields. like matches part or all of the text, and not like matches none of it.
- between/not between: For date fields. between means the date is within the set range, and not between means it is not.
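A minimal JavaScript sketch of how the three condition types behave; the sample rows, field names, and values are hypothetical, and the actual operator is configured in the UI rather than scripted:
var rows = [
    { code: 2, name: "pump A", date: "2023-03-05" },
    { code: 7, name: "valve B", date: "2023-06-20" }
];
// in: the numeric field equals one of the listed values
var inResult = rows.filter(function (r) { return [1, 2, 3].indexOf(r.code) !== -1; });
// like: the text field contains the given text
var likeResult = rows.filter(function (r) { return r.name.indexOf("pump") !== -1; });
// between: the date field falls within the set range (inclusive)
var betweenResult = rows.filter(function (r) { return r.date >= "2023-01-01" && r.date <= "2023-03-31"; });
not in, not like, and not between simply negate these checks.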
Data Connection
Integrates data from multiple tables into one.
- Drag 2 data input operators and configure them with 2 relational data tables.
- Drag the Data connection operator and connect both data sources to it.
- Double-click Data connection, and then configure its connection relation.
- Select data sources, and then click to select the join relation.
- Click the data source to be matched.
- Click New to add the match field.
Data Sorting
The sorting priority decreases with the field's position on the list; fields higher on the list have higher priority.
Data Merging
Merges data from multiple data sources. You can select columns from multiple sample sets and merge them into a data set.
- Drag 2 data input operators and configure them with 2 relational data tables.
- Drag the Data merging operator and connect both data sources to it.
- Double-click Data merging, and then configure its merging relation.
- Click New to add a merged field.
- Enter the merged field name, and then select the field type and the fields from both data sources to be merged.
Missing Value Handling
Replaces null values and empty strings in the data source so that subsequent data processing, such as feature query and modeling, goes smoothly.
- Replaces the missing value with the maximum, minimum, or average value of the column, or with a constant value, a global variable, or a linear fitting value.
- Uses a custom formula to generate replacement values. Only +, -, *, / are supported. See the sketch after this list.
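A minimal JavaScript sketch of replacing missing values with the column average; the field name and sample values are hypothetical, and the operator itself is configured in the UI:
var rows = [
    { temperature: 21.5 },
    { temperature: null },
    { temperature: 24.1 },
    { temperature: "" }
];
// average over the rows that actually carry a value
var sum = 0, count = 0;
for (var i = 0; i < rows.length; i++) {
    var v = rows[i].temperature;
    if (v !== null && v !== "") {
        sum += v;
        count++;
    }
}
var average = count > 0 ? sum / count : 0;
// replace null values and empty strings with the column average
for (var j = 0; j < rows.length; j++) {
    if (rows[j].temperature === null || rows[j].temperature === "") {
        rows[j].temperature = average;
    }
}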
Data Type Conversion
Converts the type of input fields.
If the conversion is invalid, 0 is output.
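A minimal JavaScript sketch of this fallback behavior; the sample values are hypothetical:
var values = ["12.5", "abc", "7"];
var converted = [];
for (var i = 0; i < values.length; i++) {
    var n = Number(values[i]);
    // an invalid conversion yields 0, as described above
    converted[i] = isNaN(n) ? 0 : n;
}
// converted is [12.5, 0, 7]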
Data Completion
Item | Description |
---|---|
Data completion time granularity | Completes data columns based on the set granularity. For example, if the granularity is set to Second, the column where seconds are missing is completed (see the sketch after this table). |
Completion base time field | Select a time field to be completed. |
Start data | The data column where the completion starts. |
End data | The data column where the completion ends. |
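A minimal JavaScript sketch of completion at Second granularity; the start and end times are hypothetical, and the operator itself only needs the settings in the table above:
// one row is generated for every second between the start and the end of the base time field
var start = new Date("2023-01-01T00:00:00Z").getTime();
var end = new Date("2023-01-01T00:00:05Z").getTime();
var completed = [];
for (var t = start; t <= end; t += 1000) {
    completed.push({ time: new Date(t).toISOString() });
}
// completed holds six rows, one per second, including the seconds missing from the input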
Variable Settings
Assigns new values to the global variables you set during task development. The new value can either be set manually or taken from fields of the input data source.
Data Aggregation
Aggregates data based on the set group fields.
- Select fields, and then click Add group field. The aggregated data will be grouped by the set fields.
- Select fields of numeric type, and then click Add aggregated field to add fields to be aggregated.
- Set Aggregation cycle type.
Item | Description |
---|---|
Aggregation cycle type | Select the cycle type, such as Time or Data record. |
Aggregate base time field/Sort field | Select a base sorting field. |
Aggregation cycle | Set the time or data record by which the aggregation is calculated. |
Record time filling method/Record entry method | Set the output value, which can be either the initial value or end value of each cycle. |
Sliding window | Only available when setting Aggregation cycle type to Time. |
Additional output data record field | Only available when setting Aggregation cycle type to Data record. Select an additional field for output. |
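A minimal JavaScript sketch of grouping by a group field and summing an aggregated field; the field names and values are hypothetical, and the operator itself is configured in the UI:
var rows = [
    { line: "A", output: 10 },
    { line: "A", output: 14 },
    { line: "B", output: 7 }
];
// one aggregated row per distinct value of the group field
var groups = {};
for (var i = 0; i < rows.length; i++) {
    var key = rows[i].line;
    if (!groups[key]) {
        groups[key] = { line: key, output: 0 };
    }
    groups[key].output += rows[i].output;
}
// groups: { A: { line: "A", output: 24 }, B: { line: "B", output: 7 } }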
Table Transposition
Only available for file-type data; the output is also file-type data.
Custom Operator
You can customize an algorithm and its functions.
Presently, only native JavaScript is supported.
- JavaScript rules:
- The input parameter of the JS script is a collection of data passed down from the previous node, which can contain one or more entries.
- The definition of the data collection in the script is "dataList".
- To obtain the value of a property, use the dot notation followed by the property name (e.g., ".id").
- The JS script should return the corresponding object or data collection after performing the required operation in the form of a JSON array or string.
- The variables configured on the page can be directly used in the JS script (just ensure that the variable names match).
- Debugging logs can be added to the JS script using the "log.debug()" function, and the debugging results can be viewed at the bottom of the script.
- It is possible to use Java-related collection classes for operations within the JS script.
- Script example
var arr = [];
// iterate over the rows passed down from the previous node
for (var i = 0; i < dataList.length; i++) {
    // add 10 to the id property of each row
    dataList[i].id = dataList[i].id + 10;
    arr[i] = dataList[i];
}
// the value of the last expression is returned to the downstream operator
arr;
- When calling a function with an input object, pay attention to null values that may cause a NullPointerException.
- When performing operations on input fields, be aware of the object types of the fields (comparing time objects may result in different dates appearing equal, so it is recommended to convert them to strings or primitive types before operating on them).
- When performing calculations on numerical values, the data type may change unexpectedly.
- In a JS script, custom variables cannot have the same name as global variables configured during task development.
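A second, defensive sketch that applies the notes above; the field names are hypothetical, while dataList and log.debug are the facilities described earlier:
var out = [];
for (var i = 0; i < dataList.length; i++) {
    var row = dataList[i];
    // guard against null values that would otherwise cause a NullPointerException
    if (row == null || row.temperature == null) {
        log.debug("skipping row " + i + " with a missing temperature");
        continue;
    }
    // convert the time field to a string before comparing or outputting it
    row.recordTime = String(row.recordTime);
    // force a numeric type before calculating
    row.temperature = Number(row.temperature) + 0.5;
    out[out.length] = row;
}
out;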
After writing the script, click Data simulation to debug the data. You can add up to 50 records.
Debug the script to make sure the result is reasonable.
Big Data Model
Model Operation
Runs a big data model of supOS and outputs the results.
Model Update
Selects big data models that can be updated, updates them automatically, and outputs the results.
Generate Sample Set
Generates sample set data and synchronizes it to X-BD sample set management.
- No downstream operators.
- String-type data in the sample set cannot contain the English comma (,).
Process Control
Based on the set conditional statements, branch tasks that meet the conditions are executed.
- Connect multiple operators (up to five branches) after the branch task node, and then double-click the branch task to configure its parameters.
- Click Settings corresponding to each subsequent operator, and set their conditions.
Make sure the result is boolean data; only when a condition returns True will the corresponding branch task be executed.
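For example (the field name is hypothetical), a condition such as temperature > 80 evaluates to a boolean value, and the branch behind it is executed only for data where the condition returns True.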
Data Desensitization
Processes multiple fields with data desensitization; the desensitized data overwrites the original data and flows to downstream operators.
Data Output
Data Output
Configurations are similar to the input data source. For details, see Data Input Node.
- supOS instance
- File data source
- API data source
Message Queue Output
Kafka, MQTT, and RocketMQ are supported.
Data Warehouse Output
HBase Output
- HBase data source only receives string type of data.
- rowKey is the primary key of an HBase data source, and a mapping relationship must be established for it. If the data source contains duplicate values for the field mapped to rowKey, only the latest row of data is retained (see the sketch after this list).
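A minimal JavaScript sketch of this deduplication behavior; the field names and values are hypothetical:
var rows = [
    { deviceId: "pump01", temperature: "36.5" },
    { deviceId: "pump01", temperature: "37.2" }
];
// with deviceId mapped to rowKey, a later row overwrites an earlier one with the same key,
// so only the latest row per rowKey value is retained
var byRowKey = {};
for (var i = 0; i < rows.length; i++) {
    byRowKey[rows[i].deviceId] = rows[i];
}
// byRowKey: { pump01: { deviceId: "pump01", temperature: "37.2" } }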